6.3 Signals of the Cardiovascular System


In many applications, meaningful features are extracted from the raw data to form feature vectors. Mathematically, a feature is an individual measurable property or characteristic of a phenomenon, and a feature vector is an n-dimensional vector of numerical features that represent some object [3]. The goal of a classification problem is to deduce the correct class of an object from its features using appropriate classification algorithms, or classifiers for short. The result of a classification is therefore the assignment of a class to each object. Since a classifier may assign the wrong class to an object, there are several measures for assessing the quality of a classifier. The most important ones are sensitivity, specificity and accuracy. For simplicity, let us assume that there are two classes, diseased (A) and healthy (CG). Then the sensitivity is the percentage of actually ill people in whom the test detects the disease, whereas the specificity is the percentage of healthy people whom the test correctly classifies as healthy. Finally, the accuracy is the overall percentage of correctly assigned classes. The greater the sensitivity, specificity and accuracy, the better the classifier.
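The three measures can be illustrated with a minimal Python sketch. The function name `classifier_quality` and the label lists are made up for illustration and are not taken from the clinical data; the sketch assumes that both classes occur in the true labels.

```python
def classifier_quality(true_labels, predicted_labels, positive="A"):
    """Return (sensitivity, specificity, accuracy) for a two-class problem.

    Assumes that both classes occur in true_labels; otherwise one of the
    denominators below would be zero.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # ill, detected
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # ill, missed
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # healthy, recognised
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # healthy, flagged as ill
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Made-up labels for illustration only: A = diseased, CG = healthy.
truth     = ["A", "A", "A", "A", "CG", "CG", "CG", "CG", "CG", "CG"]
predicted = ["A", "A", "A", "CG", "CG", "CG", "CG", "CG", "A", "CG"]
sens, spec, acc = classifier_quality(truth, predicted)
```

In this toy example three of the four diseased objects are detected (sensitivity 3/4), five of the six healthy objects are classified as healthy (specificity 5/6), and eight of the ten objects receive the correct class (accuracy 8/10).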

In most cases, the data set of objects is divided into a training set and a test set when a classification is performed. The training data serve to deduce rules for classifying the objects. These rules are then applied to the objects in the test set. As the results depend on how the data are split into training and test sets, the so-called k-fold cross-validation is applied: the data set is divided into k subsets of equal size. In the first step, the first subset is taken as the test set and the other subsets as the training set. In the second step, the second subset is taken as the test set and the others as the training set, and so on. In each step the classifier is tested after training, and the overall quality measures are calculated from those of the individual steps.
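The splitting scheme can be sketched in a few lines of Python. The names `k_fold_splits`, `cross_validate`, `train_fn` and `score_fn` are hypothetical, and combining the individual steps by a plain average is an assumption; the text only states that the overall measures are calculated from those of the individual steps.

```python
def k_fold_splits(n_objects, k):
    """Yield (train_indices, test_indices) for each of the k steps.

    Assumes n_objects is divisible by k, so that all subsets have equal size;
    any leftover objects would never appear in a test set.
    """
    indices = list(range(n_objects))
    fold_size = n_objects // k
    for step in range(k):
        start, stop = step * fold_size, (step + 1) * fold_size
        test = indices[start:stop]                 # the step-th subset is the test set
        train = indices[:start] + indices[stop:]   # all other subsets form the training set
        yield train, test


def cross_validate(objects, labels, k, train_fn, score_fn):
    """Train and test k times and average the resulting quality measure.

    train_fn(objects, labels) returns a fitted classifier; score_fn(classifier,
    objects, labels) returns one quality measure, for example the accuracy.
    Both are placeholders supplied by the caller.
    """
    scores = []
    for train, test in k_fold_splits(len(objects), k):
        classifier = train_fn([objects[i] for i in train], [labels[i] for i in train])
        scores.append(score_fn(classifier, [objects[i] for i in test], [labels[i] for i in test]))
    return sum(scores) / k
```

With six objects and k = 3, for instance, the first step tests on objects 0 and 1 and trains on objects 2 to 5.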

In the above clinical case of the given photoplethysmographic data, several kinds of coefficients were evaluated as feature vectors; however, the best results were obtained for the coefficients from the frequency response approach:

Let $F(k_i)$ be the element of the fast Fourier transform corresponding to the discrete frequency $k_i$, where complex features are computed only for the harmonic frequencies. This is motivated by the observation that the spectrum of the PPG signals shows a clearly visible periodicity, and by the hypothesis that the effect of an aneurysm manifests itself in the periodic properties of the signal and not necessarily in the aperiodic ones. Therefore, the first five harmonic frequencies are extracted with Matlab's findpeaks from the absolute values of the corresponding spectrum. In almost all cases these frequencies align perfectly for the input and output signals; if there is a slight deviation ($k_{i,\mathrm{in}} \neq k_{i,\mathrm{out}}$), the features for the i-th peak are still divided, giving the coefficients

$$H_i = \frac{F_{\mathrm{in}}(k_{i,\mathrm{in}})}{F_{\mathrm{out}}(k_{i,\mathrm{out}})}.$$
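The computation of the coefficients can be sketched in Python under stated assumptions. A plain discrete Fourier transform stands in for the fast Fourier transform (it yields the same values, only more slowly), and a simple local-maximum search stands in for Matlab's findpeaks; it does not reproduce that function's options. All function names and the synthetic test signals are hypothetical and are not PPG data.

```python
import cmath
import math


def dft(signal):
    """Discrete Fourier transform for the bins 0 .. n/2 - 1 (same values as an FFT)."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
            for k in range(n // 2)]


def harmonic_bins(spectrum, n_peaks=5):
    """Indices of the n_peaks strongest local maxima of |spectrum|, in ascending order."""
    mag = [abs(c) for c in spectrum]
    peaks = [k for k in range(1, len(mag) - 1) if mag[k - 1] < mag[k] >= mag[k + 1]]
    strongest = sorted(peaks, key=lambda k: mag[k], reverse=True)[:n_peaks]
    return sorted(strongest)


def harmonic_coefficients(signal_in, signal_out, n_peaks=5):
    """H_i = F_in(k_i,in) / F_out(k_i,out) for the first n_peaks harmonic peaks."""
    f_in, f_out = dft(signal_in), dft(signal_out)
    k_in, k_out = harmonic_bins(f_in, n_peaks), harmonic_bins(f_out, n_peaks)
    # The i-th peaks are paired even if their bin indices differ slightly.
    return [f_in[a] / f_out[b] for a, b in zip(k_in, k_out)]


# Synthetic periodic signals: five harmonics of a fundamental at bin 4.
# The "output" is the "input" at half amplitude with a phase lag of 0.3 * h.
n = 200
sig_in = [sum(math.cos(2 * math.pi * 4 * h * t / n) / h for h in range(1, 6))
          for t in range(n)]
sig_out = [sum(0.5 * math.cos(2 * math.pi * 4 * h * t / n - 0.3 * h) / h for h in range(1, 6))
           for t in range(n)]
coeffs = harmonic_coefficients(sig_in, sig_out)
```

For these synthetic signals each coefficient has magnitude 2 and phase 0.3 h for the h-th harmonic, because the output has half the amplitude of the input and lags it in phase.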

This is done for five intervals of ten seconds each, and the resulting values (real and imaginary parts) are averaged. For the sake of simplicity, we will consider only the case in which the input signal is taken at the right thumb and the output signal at the right toe.
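The averaging step can be sketched as follows. The function name `average_features` is hypothetical, and the ordering of the resulting feature vector (real part followed by imaginary part for each coefficient) is an assumption, since the text does not specify the layout.

```python
def average_features(coefficients_per_interval):
    """Average real and imaginary parts of each H_i over all intervals.

    coefficients_per_interval holds one list of complex coefficients per
    interval. The returned layout [Re H_1, Im H_1, Re H_2, Im H_2, ...] is an
    assumed convention.
    """
    n_intervals = len(coefficients_per_interval)
    n_coeffs = len(coefficients_per_interval[0])
    features = []
    for i in range(n_coeffs):
        mean_re = sum(interval[i].real for interval in coefficients_per_interval) / n_intervals
        mean_im = sum(interval[i].imag for interval in coefficients_per_interval) / n_intervals
        features.extend([mean_re, mean_im])
    return features


# Made-up coefficients for two intervals and two harmonics, for illustration only.
example = [[1 + 2j, 3 - 1j],
           [3 + 0j, 1 + 1j]]
feature_vector = average_features(example)
```

With five intervals and five harmonic coefficients per interval, this layout would give a feature vector of ten real numbers per object.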